[GLUTEN-10215][VL] Delta: Native write support for Delta 3.3.1 / Spark 3.5 #10801
Merged
dcoliversun merged 10 commits into apache:main on Oct 15, 2025
This will be helpful in measuring the write performance of Gluten's lake format support.
FelixYBW approved these changes on Oct 14, 2025
Thanks for reviewing, @dcoliversun @FelixYBW. This PR is still slow because the stats visitor is not offloaded to Velox, but the base write functionality is covered. I'll look into offloading the stats visitor going forward.
Description
The code grew out of the PoC in #10216.
The PR adds native Delta write support by offloading the Parquet writer to Velox.
The PR only adds support for Spark 3.5 / Delta 3.3.
A TODO item list is at #10215.
This PR relies on #10796 and #10802.
Usage
Set `spark.gluten.sql.columnar.backend.velox.delta.enableNativeWrite=true` to enable the feature. The option is turned off by default.
Test
Unit tests added in #10802 are used to test this PR.
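For trying the option from the Usage section by hand, the following is a minimal sketch of how the flag could be rendered as `spark-submit` arguments. Only the configuration key and its off-by-default behavior come from this PR; the helper names `delta_native_write_conf` and `to_submit_args` are hypothetical, and the usual Gluten and Delta session settings are assumed to be configured separately.

```python
# The key below is taken from the PR description; the helpers are illustrative only.
DELTA_NATIVE_WRITE_KEY = (
    "spark.gluten.sql.columnar.backend.velox.delta.enableNativeWrite"
)


def delta_native_write_conf(enabled: bool = False) -> dict[str, str]:
    """Return the Spark conf entry that toggles native Delta writes.

    Defaults to disabled, mirroring the PR's statement that the option is off by default.
    """
    return {DELTA_NATIVE_WRITE_KEY: "true" if enabled else "false"}


def to_submit_args(conf: dict[str, str]) -> list[str]:
    """Render conf entries as `--conf key=value` arguments for spark-submit."""
    args: list[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments that would enable the feature.
    print(" ".join(to_submit_args(delta_native_write_conf(enabled=True))))
```

In a PySpark session the same entry could instead be passed through `SparkSession.builder.config(key, value)` before the session is created.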
UI
With the patch, the offloaded write operations (typically v1 / v2 write commands) will have a `Gluten Delta` prefix on the left-hand side of their node names, both in explain output and in the UI.
Vanilla Delta commands
Gluten Delta commands
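Since offloaded commands are marked with the `Gluten Delta` prefix, one rough way to confirm that a write was offloaded is to scan the explain text for that prefix. The sketch below assumes the plan is already available as a string; the function names are hypothetical, and the exact node names that follow the prefix are not specified here.

```python
GLUTEN_DELTA_PREFIX = "Gluten Delta"  # prefix named in the PR description


def offloaded_nodes(plan_text: str) -> list[str]:
    """Return the plan lines that mention the Gluten Delta prefix."""
    return [
        line.strip()
        for line in plan_text.splitlines()
        if GLUTEN_DELTA_PREFIX in line
    ]


def is_write_offloaded(plan_text: str) -> bool:
    """True if at least one plan node carries the Gluten Delta prefix."""
    return bool(offloaded_nodes(plan_text))
```

In PySpark, one possible source for `plan_text` is the single `plan` column returned by running an `EXPLAIN` statement through `spark.sql(...)`; whether a given command shows the prefix there depends on the feature being enabled.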
Performance
Slower than the vanilla Delta writer as of now.
The implementation has yet to be optimized: the stats tracker has not been offloaded to native Velox, which adds significant overhead.